ABSTRACT



Two statistical prediction models for the 24-hour surface pressure
change are developed. One model employs the terms in a dynamic
model as the independent variables in a linear regression equation.
The other model combines these variables with parameters capable of
reflecting the long-wave, long-term influences in a multivariate
discriminant analysis. The regression equations were developed from
data taken from the month of November 1962 at 50N latitude. A
discussion of the results of both methods is presented along with a
critique of the procedures used in obtaining the data.

The writers wish to express their appreciation for the guidance and 
encouragement given them by Professor Frank L. Martin of the 
U. S. Naval Postgraduate School in this investigation. 

They are also indebted to Lieutenant Commander Mildred J. Frawley, 
United States Navy, of the Fleet Numerical Weather Facility for 
assistance in computer programming. 
1. Introduction

Numerical techniques for the sea-level pressure prognosis have
not shared the same success as those used for forecasting the 500-mb
contour map. Methods presently in operational use consist in general
of a 500-mb prognosis coupled with a prognosis of thickness using some
form of the first law of thermodynamics. Methods similar in nature
have been suggested by Haltiner and Hesse.

In addition, it has become standard practice with such units as the
U. S. Navy Fleet Numerical Weather Facility, Monterey, California,
(FNWF) to superimpose certain empirical corrections on the location
and intensity of cyclones and anticyclones. The inadequacy of
numerical methods to predict cyclogenesis and anticyclogenesis is
perhaps the greatest deficiency of the presently operational numerical
techniques. This study was undertaken to help eliminate some of
these problems.

Two models were developed: (1) a model using solely dynamic
parameters in the form of a multiple linear regression equation; and
(2) a stepwise multivariate regression analysis using the "dynamic"
predictors as well as certain other parameters selected to reveal
long-term and long-wave influences on the surface pressure change.
For convenience the two models are hereafter referred to as "the
dynamic model" and "the statistical model", respectively. Separate
and detailed descriptions of the two models are presented in the
following sections.
Latitude 50N in early winter months was chosen as a latitude
offering typical, if not difficult, forecast problems for the investigation.
The dependent data were taken, insofar as possible, from 30 consecutive
days beginning in late October, 1962. Early December of the same
year provided five days of independent data. Twenty-four geographical
locations, at intervals of 15 degrees of longitude around the latitude
belt, were chosen as "stations". The "stations" were arbitrarily
numbered from 1 to 24, starting at ocean station "Papa" in the Gulf of
Alaska, in an eastward direction.

The data used in both models were taken from numerical computations
and printout charts prepared by use of the CDC 1604 electronic
digital computer. The programs and data tapes necessary to compute
and print the charts were supplied by FNWF. All statistical computations
were made on the same computer using selected programs from
the BIMD* Fortran library.

The techniques employed in this work were not designed to supplant
the numerical methods now in use. This effort was conducted so as
to illuminate some of the factors not now considered that may be
influential in causing a sea-level pressure change. Time steps of
twenty-four hours were attempted in connection with both models in
order to test the efficacy of this longer-than-normal time step in
conjunction with statistical methods.

*A collection of statistical electronic computer programs
compiled and edited by the Biology and Medical Department of the
University of California, Los Angeles. This manual is distributed
through the UCLA campus bookstore.
2. A derivation of the dynamical prediction equation

The following development is taken, after Reed [1962], with some
modifications and simplifications in the later stages of the derivation.
Pertinent remarks will indicate where these modifications are
applicable.

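The frictionless vorticity equation for the 1000-mb level may be
well approximated by

$$\frac{\partial}{\partial t}(\zeta + f) = -\mathbf{V}_0 \cdot \nabla(\zeta + f) + f\left(\frac{\partial \omega}{\partial p}\right)_0 . \tag{1}$$

If we assume a parabolic vertical velocity profile between the
surface and 500 mb (subscript 5) of the form

$$\omega = \omega_0 + (\omega_5 - \omega_0)\left[1 - \left(\frac{p - p_5}{p_0 - p_5}\right)^2\right] \tag{2}$$

and substitute for $(\partial\omega/\partial p)_0$ in (1), we obtain

$$\frac{\partial}{\partial t}(\zeta + f) = -\mathbf{V}_0 \cdot \nabla(\zeta + f) - \frac{2f}{p_0 - p_5}(\omega_5 - \omega_0) . \tag{3}$$

The geostrophic wind is used to approximate the vorticity and,
following Reed, the 1000-mb contour pattern may be regarded as
consisting of a set of equally-spaced circular highs and lows of the
form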

x5iK.^r«j ( 4 ) 

superimposed on a constant zonal current U. Here x and y are the
eastward and northward distance elements, respectively, and the
absolute constant resulting from the integration of U along the meridian
is taken as zero. Then the 1000-mb relative vorticity becomes



X = 1 v 2 z? 0 = "Tu? 9 [2° + . 

Substituting (5) into (3) yields the equation 




The particular form of the adiabatic thermodynamic equation 
used here is 



<L -V-Fsb -0-6O, 

ftl sf ) V V 5p ' 3&3Tf> 171 

With the assumption of a linear geostrophic-wind hodograph in the 
layer between 1000 mb and 500 mb, equation (7) may be integrated 
with respect to p from 1000 to 500 mb to give 



sL ( is-Zj = - V e -V (Zs-Zj- (Ztor+o).) . 

j-t 5 



Next multiply (8) by 



(8) 



Jk'= z 



(9) 
a slowly varying parameter which will be regarded here as constant,
and add the modified version of (8) to (6), giving

(10)

Use of the kinematic boundary condition allows the
vertical velocity $\omega_0$ to be included within both brackets of (10) as a
terrain effect. This last effect is not considered in this paper, since
it is not one of the dynamic factors explicitly selected for the statistical
regression employed in this study. Hence we are left with the
result



tefc-V'Ji-YWfcZo+H+JL' (2s-£,)] (n) 



as the prediction equation. 

Equation (11) is now rearranged to give 

j r M Is ~ ( /+ &')£e + M j ~ * V i$ ~i l +-&') Zoi H J (12) 

a jjk is ~io + H] ~ * +tt] 

(13) 



We next make use of the principle, first introduced by Fjortoft
[1952] and extended by Reed [1962], of employing an equivalent
advecting wind which gives the same instantaneous advection as
the actual advecting wind in (13)
but which has the property of changing more slowly with time. This
concept has proved to be particularly valuable in graphical integration
of the dynamical equations where long time steps are employed.

Thus (13) may be written in the equivalent form

$$\frac{\partial}{\partial t}\left[k z_5 - z_0 + H\right] = -\mathbf{V}_E \cdot \nabla\left[k z_5 - z_0 + H\right] \tag{14}$$

or, alternatively as 

■o - V E .V [k ^o+U] -f J (15) 

where 

V 0 =flU|F2,, V E = Ifex g <»> 
3. Simplification of the dynamic model

In Reed’s model the scalar field whose gradient defines $\mathbf{V}_E$
has a term (-M) representative of terrain effects in addition to those
in equation (16). As already noted, in approaching the statistical
application of the dynamic model we have set M=0.

A word of discussion regarding Reed's function G and our function
H is appropriate. From equations (6) and (9), G is expressible as



Gj — G) Jz') - ;l _ ) j l — - Q*_Cn£j $ 2 

Sr(^ / !\J 



Reed uses a mean value of k=0.55 or k=1.22 for wave number N=6 
in connection with his 12-hr Lagrangian prediction technique. It may 
be shown that G- •'I ^ 5 1 and hence G 



is an increasing function of latitude up to $\phi$ = 45N. For
45 $\le \phi \le$ 55 degrees, which is typical of the range of latitude
encountered in this study, G is a slowly decreasing function. The mean
northward gradient of the G-field centered at latitude 50N, over a
10-degree span, is such that it is equivalent to a wind of
3 knots in an eastward direction.

Recall that our function H is given by $G/(1+k')$, and using
$(1+k') = 2.22$ it follows that the term involving U is equivalent to a
mean maximum zonal wind (according to Namias and Clapp [1951])
of 2 knots. In lower latitudes the effects of the G and U terms are of
opposite algebraic sign, whereas in the latitude belt of this discussion
the gradient of H is equivalent to an eastward wind of 5 knots.
Consequently the H-field in (16) has been neglected relative to $kz_5$ in the
$\mathbf{V}_E$ field. With this simplification the prediction equation (15) becomes



$$\frac{\partial z_0}{\partial t} = \mathbf{V}_E \cdot \nabla h + \frac{\partial}{\partial t}(k z_5) \tag{17}$$

since $h = z_5 - z_0$.

The last term in (17) may be obtained by employing the
graphical prediction technique for the barotropic model as described
in Haltiner and Martin [1957, pp. 395-398]. If $\bar{z}$ represents the
space-mean 500-mb height, the Fjortoft method leads to the result

(18)

at the level of nondivergence (assumed to be 500 mb). Fjortoft's
space-mean advecting wind $\mathbf{V}_{\bar{z}+J}$ is given by



V= = akxVd+J) . < 19 > 

4 

We therefore have the familiar result 

J ) ~ • (20) 

C )t 2+J 

The Fjortoft graphical treatment makes use of the following function 
Petterssen [1956, p. 392] gives the distribution of the J-field
based upon a grid mesh d=1000 km. At latitude 50N, the gradient of
J which he shows is equivalent to an easterly geostrophic wind of
8.5 knots. In regions of maximum advection of $(\bar{z}_5 - z_5 + J)$, it appears
reasonable on the basis of a statistical approach to approximate the
500-mb advecting space-mean wind by $\mathbf{V}_{\bar{z}}$, where $\mathbf{V}_{\bar{z}}$ is
the space-mean geostrophic wind at 500 mb. Moreover, since our
"stations" are far apart and confined to the 50N latitude circle, the
working hypothesis made in the following statistical analysis is that 
gives only "feedback" contributions to the 
For simplicity the 1000-mb height change resulting from (15) and 
(18) becomes 



$$\frac{\partial z_0}{\partial t} = \mathbf{V}_E \cdot \nabla h + k\,\mathbf{V}_{\bar{z}} \cdot \nabla(\bar{z}_5 - z_5 + J) . \tag{21}$$



Note that since $\mathbf{V}_E = k\mathbf{V}_5$, it follows that for 24-hr advection it is
appropriate to consider the first term on the right side of (21) as a
reduced advection. Furthermore the vector $k\mathbf{V}_{\bar{z}}$ in (21) has also
been treated as the reduced advecting wind $\mathbf{V}_E$, since an
interpretation of this kind has been found useful in "advecting" the movement
of rain areas associated with progressive sea-level cyclones by
Renard [1959]. We have noted that Reed gives the value k=0.55
as appropriate to a Lagrangian forecast technique with 12-hr time
increments. In our computations k was rounded off to 0.5 in view of
the subjectivity of hand measurements of the space-mean geostrophic
wind. Hence with these simplifications our prediction model
now becomes

$$\frac{\partial z_0}{\partial t} = A_0 + A_1\left[-\mathbf{V}_E \cdot \nabla h\right] + A_2\left[-\mathbf{V}_E \cdot \nabla(\bar{z}_5 - z_5 + J)\right] . \tag{22}$$



Here the regression coefficients $A_0$, $A_1$, $A_2$ are introduced as
unknowns in order to absorb any statistically-determined feedback
relationships contained in the preceding analysis, such as, for example,
the assumption 

In equation (22) the best-fit determination of $A_0$, $A_1$, and $A_2$ will
be made by least squares. In essence, however, this equation presents
a dynamically formulated problem similar to that of Reed [1956] and
Haltiner and Hesse [1958].
4. Computational procedures for the dynamic model

The local change of the 1000-mb height in equation (22) is written for
a 24-hr time step and is considered to be strictly proportional to the
24-hr change of sea-level pressure, $\Delta p_0$, which serves as the dependent
variable. The advecting terms in the working equation (22) are then
used as the independent variables in a multiple linear regression
equation of the form

$$\Delta p_0 = A_0 + A_1\left[-\mathbf{V}_E \cdot \nabla h\right] + A_2\left[-\mathbf{V}_E \cdot \nabla(\bar{z}_5 - z_5 + J)\right] . \tag{23}$$



The advecting terms were evaluated by a quasi-Lagrangian technique
using fixed-point computations at points determined by trajectory
tracing. At each of the 24 stations for 30 days an upwind point was
determined. This was accomplished by specifying a geostrophic wind
from the 500-mb space-mean height field terminating at the station in
question and then tracing the upwind trajectory in the contour channel
for a distance corresponding to 24 hours. In the cases of difluent
and confluent contours, 6-hour steps on the initial chart were used so
that more representative space-mean winds were available at each step.
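
The trajectory tracing described above was done by hand on the printed charts. As a rough illustration only, the following Python sketch steps backward along a wind field in 6-hour increments with the wind reduced by the factor k. The function names, the flat (x, y) coordinates in kilometers, and the uniform test wind are assumptions made for the example and do not come from the original procedure.

```python
def trace_upwind(x, y, wind, hours=24.0, step_hours=6.0, k=0.5):
    """Trace a trajectory backward in time from the station at (x, y), in km.

    wind(x, y) returns the space-mean geostrophic wind (u, v) in km/h on the
    initial chart; k is the reduction factor applied to that wind.
    """
    n_steps = int(round(hours / step_hours))
    for _ in range(n_steps):
        u, v = wind(x, y)
        # Move against the reduced wind to approach the upwind point.
        x -= k * u * step_hours
        y -= k * v * step_hours
    return x, y


def uniform_westerly(x, y):
    """Hypothetical test field: a constant 40 km/h wind blowing from the west."""
    return 40.0, 0.0


upwind = trace_upwind(0.0, 0.0, uniform_westerly)
```

With a non-uniform wind function the 6-hour steps let the advecting wind change along the path, which is the purpose stated in the text for difluent and confluent contours.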

The parameters $\bar{z}_5$ and J as used in the vorticity term were computed
using a square grid distance of 782 km. The space-mean 500-mb height
field used to determine the advecting wind was obtained by subjecting
the 500-mb heights to four scans using a "smoother" of the type

@ + v 2 Ay) T 



( 24 ) 



where subscripts S and I refer to smoothed and initial values, 
respectively. 

As previously noted, k was taken to be 0.5 and the
wind speed in all cases was reduced by this factor. With this choice
of advecting wind, the upwind point thus determined for fields of both
h and $(\bar{z}_5 - z_5 + J)$ was the same for both advection computations. The
difference of the values of h and $(\bar{z}_5 - z_5 + J)$ from their initial values
over the station was recorded as the 24-hr advective change.

FNWF data tapes served to give printouts of the entire fields
for all 24-hr forecast periods under consideration. Values of $\Delta p_0$
at the stations along 50N were then obtained by bilinear interpolation.
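
For reference, bilinear interpolation within one grid cell can be sketched as follows. This is a generic illustration in Python, not the routine actually used; the function name and argument order are assumptions.

```python
def bilinear(f00, f10, f01, f11, tx, ty):
    """Interpolate inside one grid cell.

    f00, f10, f01, f11 are the field values at the cell corners
    (0,0), (1,0), (0,1) and (1,1); tx and ty are the fractional
    distances (0 to 1) of the station from the (0,0) corner.
    """
    bottom = f00 * (1.0 - tx) + f10 * tx   # along the lower edge
    top = f01 * (1.0 - tx) + f11 * tx      # along the upper edge
    return bottom * (1.0 - ty) + top * ty  # between the two edges


center_value = bilinear(1.0, 2.0, 3.0, 4.0, 0.5, 0.5)
```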
5. Results for the dynamic model

The BIMD 06 program of the BIMD library was used to perform the
least-squares regression analysis based upon the use of equation (23)
for the dynamic model. This was accomplished for each of the three
data stratifications shown in Table 1. Relevant statistics obtained by
this analysis are displayed in Table 2. Here PRV stands for per cent
reduction of variance, while R is the multiple correlation coefficient.
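
To make the quantities in Table 2 concrete, the following Python sketch fits a least-squares regression with a constant term and returns the coefficients, the reduction of variance (as a fraction), and the multiple correlation coefficient. It is not the BIMD 06 program; the function names `solve` and `fit` and the small data set are hypothetical and serve only as an illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit(predictors, y):
    """Least-squares fit of y = A0 + A1*x1 + A2*x2 + ...

    predictors is a list of rows (x1, x2, ...), one row per case.
    Returns (coefficients, reduction of variance, multiple correlation).
    """
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    # Normal equations (X'X) A = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coeffs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coeffs, r)) for r in rows]
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    prv = 1.0 - sse / sst
    return coeffs, prv, max(prv, 0.0) ** 0.5


# Hypothetical data in which y depends exactly on two predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 - 2.0 * a + 0.5 * b for a, b in zip(x1, x2)]
coeffs, prv, r = fit(list(zip(x1, x2)), y)
```

In this sketch the reduction of variance is the squared multiple correlation coefficient, which agrees with the PRV and R columns of Table 2 (for example, .382 squared is about .146).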

After analyzing the results obtained by stratifying the data, it was
decided that, because of the greater percent reduction in variance given by
data from the fifteen stations over or near the land areas, this
group would be used to develop the final regression equation. The
resulting equation was (land areas only):



Ap a = 6.5S-.o4fV E -V^-l.8(-V E Vh) . (25 ) 

Note that the regression coefficients are negative, indicating that
advection of higher values of $\eta$ (that is, $\bar{z}_5 - z_5 + J$) and of h each
cause a negative contribution to the pressure change. This agrees with
usual synoptic observations.

Since the assumption of a linear relationship between the predictors
and the predictand was not necessarily valid, scattergrams of $\Delta p_0$
versus both independent variables were examined in order to investigate
possible indications of a preferred relationship. The BIMD 27 program
as adapted to the CDC 1604 computer was employed to plot these
scattergrams. The results using thickness and absolute vorticity advection,
respectively, are shown in figures 1 and 2. Examination of these figures
shows that while a linear regression between $\Delta p_0$ and the thickness
advection is reasonably valid, the relationship between $\Delta p_0$ and the
absolute vorticity advection appears to be non-linear with no obvious
correlation. The simple correlation coefficients found between $\Delta p_0$
and the thickness advection, and between $\Delta p_0$ and the absolute
vorticity advection, were -.41 and -.04 respectively (see also Table 2a).

Note that the percent reduction in variance attributed to the partial
correlation of the vorticity advection is insignificant, indicating that
the thickness-advection parameter will give equally good results when
used alone as a predictor. This was not due to a significant correlation
between the two predictors, since their simple
correlation coefficient was only 0.17.

The regression equation developed from these parameters was tested
on an independent sample of 75 cases drawn from data gathered in the
month of December with the results shown in Table 2b.

The apparent lack of success of the 24-hour absolute vorticity
advection, as measured by the method here employed, to furnish
significant predictability for the subsequent 24-hour pressure change
was surprising, particularly in view of the comparative usefulness of
the thickness advection. Some of the primary reasons for these
differences in the statistical behavior of these two predictors were
revealed upon a critical re-examination of the initial-data fields. These
are discussed in subsections (a), (b) and (c), below, with some
resulting conclusions in (d).

(a) Advection of 500-mb absolute vorticity. The values of this variable
are sensitively dependent upon the field of $\bar{z}_5 - z_5 + J$. This field had the
characteristic of exhibiting unusually strong gradients over short
distances near the vorticity centers, with relatively weak gradients
elsewhere. Thus any cumulative error in constructing a 24-hour upwind
trajectory, based on the use of the equivalent-advecting wind,
can give rise to sizeable differences in the value of $\bar{z}_5 - z_5 + J$ to be
advected. Reed [1962] has also referred to the question of trajectory
accuracy.

(b) Advection of thickness. The printed fields of $h = z_5 - z_0$ displayed
a smaller degree of non-linearity over short distances and were
subject to considerably smaller upwind-point error. The difference in
appearance of the $\bar{z}_5 - z_5 + J$ and $z_5 - z_0$ fields may be attributable to
the smoothing process used in obtaining $\bar{z}_5$, whereas no smoothing was
employed in obtaining the thickness field.

(c) Approximation for advection of absolute vorticity.
In a small percentage of cases of 24-hour 500-mb advection the value
of



was not equal to that using the 



± CV| • V (Zg-Zf-tjQ 

advecting wind. This occurred when a 24-hour
trajectory passed over an extreme value of $\bar{z}_5 - z_5 + J$.

(d) It must be concluded that the statistical Lagrangian technique
employed here suffers from the defect of employing excessive
time-steps. While the procedure bears some similarity to the Fjortoft
technique, it does not have the advantage of scanning comparative data
from all latitudes within the grid map. Hence the smoothing capability
usually available in most prognostic procedures (and in analysis in
general) could not be used as a prognostic aid here.
6. A multivariate linear regression analysis for 24-hour prediction
of sea-level pressure

In this phase of the investigation a total of 28 independent variables
were tested as possible predictors in a purely statistical approach.
Recent investigations by Ostby-Viegas, Miller, and
others employing statistical reduction methods have utilized the
advantages of a computer to reduce large numbers of possible predictors
to a significant few. Such methods may use either objectively-determined
data with no immediate rationale for the relationship, or
dynamically-based data to select variables. Such significant multiple linear
regressions as exist are found by a statistical screening process.
Once an objectively chosen variable has been selected through the
screening process it is usually possible to find a causal relationship
between it and the predictand on the basis of synoptic-dynamic
considerations.

In this section, the intent was to utilize the two predictors already
employed in the dynamic model (see equation (23)). Since the predictors
already chosen rest heavily on 500-mb parameters it was decided to
choose, as far as possible, additional objective parameters from this
level. Furthermore, since the month of November 1962 was
characterized by contrasting mid-latitude regimes in the Pacific and Atlantic
Oceans, with higher than normal mid-latitude zonal flow in the Pacific
and anomalous blocking action in the Atlantic, it was felt that the
24-hour values of the dynamic (advective) parameters might contain
considerable scatter at stations influenced by such contrasting
meteorological activity. Consequently it was decided to select a limited number
of 500-mb parameters indicative of the long wave features and, to some
degree, of the extended-period circulation anomalies. Recourse is
made here to some of the extended-forecast concepts of Namias [1951].
At each station the latest 500-mb height and those for each of the
three preceding 24-hour map times were read off or interpolated from
the contoured printout charts. From these data the two sets of
parameters, the two-day trend $T_i$ and three-day mean $\bar{Z}_i$, were computed
for each station, and a regression equation of the form

$$\Delta p_0 = A_0 + A_1\left[-\mathbf{V}_E \cdot \nabla h\right] + A_2\left[-\mathbf{V}_E \cdot \nabla\eta\right] + \sum_j B_j T_{i-2j} + \sum_j C_j \bar{Z}_{i-2j} + D u_i + E v_i \tag{26}$$




is sought, where j is a 15° longitude interval, considered positive
eastward. The subscript i - 2j indicates that we are examining data
at all upwind points 30° longitude apart. Note that the two-day trend
at station i is given by

$$T_i = (z_0 - z_{-2})_i , \tag{27}$$

and $\bar{Z}_i$ for simplicity has been defined here as the simple arithmetic
mean

$$\bar{Z}_i = (z_0 + z_{-1} + z_{-2} + z_{-3})_i / 4 . \tag{28}$$

In (27) and (28) the subscripts 0, -1, -2 and -3 denote the latest
500-mb map time and the three preceding 24-hour map times.

The implication of the summation sign before the $T_{i-2j}$ and $\bar{Z}_{i-2j}$
terms is that we are considering all of the large-scale upwind and
downwind rates of change and mean heights, in contributing predictability
to $\Delta p_0$ at the site denoted by the subscript i.
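
The construction of the long-wave predictors can be sketched as below. The Python code is an illustration under stated assumptions: stations are indexed from 0 to 23 (rather than 1 to 24), the trend is taken as the latest height minus the height two days earlier, and the summation is assumed to run over all twelve stations spaced 30° apart, wrapping around the latitude circle. The function names and the test heights are hypothetical.

```python
N_STATIONS = 24  # one station every 15 degrees of longitude around 50N


def two_day_trend(z):
    """z = [z_0, z_-1, z_-2, z_-3]: latest height, then the three earlier maps."""
    return z[0] - z[2]


def mean_height(z):
    """Simple arithmetic mean of the four map times."""
    return sum(z) / 4.0


def long_wave_predictors(heights, i):
    """Collect trends and mean heights at stations i, i-2, i-4, ...

    heights[s] is the four-element height history at station s. The index
    i - 2j wraps around the latitude circle, so twelve values of each
    parameter are returned.
    """
    trends, means = [], []
    for j in range(N_STATIONS // 2):
        s = (i - 2 * j) % N_STATIONS
        trends.append(two_day_trend(heights[s]))
        means.append(mean_height(heights[s]))
    return trends, means


# Hypothetical heights: a steady 10-unit rise per day at every station.
heights = [[5500.0 + s, 5490.0 + s, 5480.0 + s, 5470.0 + s]
           for s in range(N_STATIONS)]
trends, means = long_wave_predictors(heights, 3)
```

Twelve trends and twelve means, together with the two advective terms and the two 850-mb wind components, give the 28 candidate predictors mentioned in the text.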

The variables $u_i$ and $v_i$ are the 850-mb zonal and meridional
geostrophic-wind components at station i. They are included in the
regression equation of the statistical model, firstly, to represent
possible relevant low-level effects, and secondly, in order that a
physically important factor such
as $\overline{uv}$ (where the superior bar indicates a zonal mean) may at least
be implicit within the set of independent variables. The significance
of this cross-covariance is that it suggests effects of the zonal-index
cycle (see Haltiner and Martin [1957, pp. 446-448]).

With 28 possible predictors in the statistical model, the BIMD 09 program
was used to perform the regression analysis.
This program is a modification of one originally written by M. A.
Efroymson [1955] of the Esso Research and Engineering Company.
The important features of the program include a stepwise screening
of the predictors using arbitrary upper and lower critical F-values
as cutoff limits for inclusion or rejection of variables.

The F statistic as employed in this test is the ratio of the mean
squares explained by the regression to the residual or unexplained
mean squares. According to Anderson [1960] the F-ratio is the ratio
of two chi-square distributed variables, each divided by its degrees
of freedom, with k and n-k-1 degrees of
freedom, where n is the number of cases in the data sample. Miller
[1962] suggests values for the critical or cutoff F-values for introducing
predictors into the regression. If P is the total number of possible predictors
(28 in this investigation) and k is the number of predictors already
selected, his critical F-value is given as



\ ~/P< 

p-jk+t 



(29) 



where ^ •— . The value^K^fr = .05 is the usual critical 

P-Jfe+I 

significance level in the selection test. It is apparent that the level 
imposed by this method of determining a critical F-value will decrease 
as more predictors are chosen. 

Inasmuch as the BIMD 09 screening program uses a fixed F-level
throughout the screening process, it seems desirable to perform several
regression analyses of the data with differing F-levels but retaining one
coinciding with Miller’s (F = 10.0). In a recent analysis (Martin et al.
[1963]) the recommendation is made that a lower F-level for rejection of
a previously selected variable should be taken as zero when using the
BIMD 09 program with Miller's selection criterion. Accordingly,
predictors were arbitrarily selected in this analysis with upper critical
F-levels of 10.0 and 5.0, and a lower limit in each case of zero.
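
The screening idea can be illustrated with a short Python sketch of forward selection using a fixed F-to-enter level and no removal step (corresponding to a lower F-level of zero). This is not the BIMD 09 program or Efroymson's algorithm as coded there; the function names and the test data are hypothetical. At each step the candidate giving the largest drop in the residual sum of squares is tested, and the remaining candidates are then orthogonalized against it.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]


def forward_select(columns, y, f_enter):
    """Forward screening with a fixed F-to-enter level.

    columns maps a predictor name to its list of values. Returns a list
    of (name, F on entry) in the order the predictors were chosen.
    """
    n = len(y)
    resid = center(y)  # centering accounts for the constant term
    pool = {name: center(col) for name, col in columns.items()}
    sse = dot(resid, resid)
    chosen = []
    while pool:
        p = len(chosen) + 1  # number of predictors in the trial model
        if n - p - 1 <= 0:
            break
        best_name, best_f, best_gain = None, 0.0, 0.0
        for name, x in pool.items():
            xx = dot(x, x)
            if xx < 1e-12:  # candidate already explained by chosen ones
                continue
            gain = dot(x, resid) ** 2 / xx  # drop in residual sum of squares
            rest = sse - gain
            f = float("inf") if rest <= 0.0 else gain / (rest / (n - p - 1))
            if f > best_f:
                best_name, best_f, best_gain = name, f, gain
        if best_name is None or best_f < f_enter:
            break
        x = pool.pop(best_name)
        xx = dot(x, x)
        b = dot(x, resid) / xx
        resid = [r - b * xi for r, xi in zip(resid, x)]
        sse -= best_gain
        # Orthogonalize the remaining candidates against the new predictor.
        pool = {name: [ci - dot(c, x) / xx * xi for ci, xi in zip(c, x)]
                for name, c in pool.items()}
        chosen.append((best_name, best_f))
    return chosen


# Hypothetical data: y follows x1 closely; x2 carries no information.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.2, 0.15, 0.0, -0.1, 0.2, -0.15, 0.0]
y = [2.0 * a + e for a, e in zip(x1, noise)]
chosen = forward_select({"x1": x1, "x2": x2}, y, f_enter=10.0)
```

Lowering `f_enter` admits more predictors, which is the behavior reported in the text when the F-level is reduced from 10.0 to 5.0.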

Selected parameters are shown in Table 3 in the order in which they
were chosen by the analysis of the dependent data. The form of the
F-test used to compute the significance of the final regression equation
is given (after Anderson [1960, p. 89]) as below:

$$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)} , \tag{30}$$

where R is the multiple correlation coefficient. The percent reduction
in variance, $R^2$, is given by the formula

$$R^2 = 1 - \frac{s_e^2}{s_y^2} , \tag{31}$$

where $s_e$, the reduced standard error, and $s_y$, the total standard
error, are available at each step of the BIMD 09 program printout.
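
As a check on the arithmetic, the standard F-ratio for a multiple regression, F = (R²/k) / ((1 - R²)/(n - k - 1)), can be evaluated for the final statistical equation using the figures quoted in this section: a reduction of variance of 22.04 percent, k = 6 predictors and n = 450 cases. The Python function name below is hypothetical.

```python
def regression_f(r_squared, k, n):
    """F-ratio of a regression with k predictors fitted to n cases."""
    explained = r_squared / k
    unexplained = (1.0 - r_squared) / (n - k - 1)
    return explained / unexplained


# Values quoted in the text for the six-predictor statistical model.
f_final = regression_f(0.2204, 6, 450)
```

The result is close to 20.9, the value of F(6, 443) listed in Table 3.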

The cumulative percent reduction in variance was computed for
each step of the regression (see Table 3). Note that using Miller's
suggested initial F-level of 10.0 only two parameters are chosen by the
screening process. These significant predictors are $-\mathbf{V}_E \cdot \nabla h$,
the advection of thickness as employed in the dynamic model, and $T_i$,
the 2-day 500-mb height change over the station. Together, these
parameters give a cumulative percent reduction in variance of 18.43.
When the F-level is lowered to 5.0 four additional parameters are
selected, each of which contributes a relatively small gain in PRV.

The four additional parameters selected (in the order chosen) were
$-\mathbf{V}_E \cdot \nabla\eta$, or the advection of $\bar{z}_5 - z_5 + J$ as employed in the dynamic
model; $u_i$, the zonal geostrophic component of the 850-mb wind over the
station; $T_{i-6}$, the 2-day 500-mb height trend 90° upstream;
and $\bar{Z}_{i-10}$, the 3-day mean 500-mb height 150° upstream. The
resulting regression equation which includes the six selected variables
(see Table 3) is

$$\Delta p_0 = 116.1 - 1.6\left[-\mathbf{V}_E \cdot \nabla h\right] - .28\,T_i - .18\left[-\mathbf{V}_E \cdot \nabla\eta\right] - .54\,u_i - .24\,T_{i-6} - .14\,\bar{Z}_{i-10} . \tag{32}$$

While the computed F-levels indicate that these last predictors are
statistically significant it is possible that such correlations as are
indicated arise from "noise" and/or erroneous data. Under such
circumstances the regression equation may tend to "overfit" the sample. A
discussion of this effect is given by Panofsky and Brier.

The existence of "overfitting" of the dependent sample is demonstrated
by the instability of the regression equation when applied to the independent
data. This phenomenon may evidence itself by a substantial decrease
("shrinkage") in the percent reduction of variance explained with the
independent sample. For example, when equation (32) was tested on an
independent sample of 75 cases, a PRV of 10.6 percent occurred, compared
to 22.04 percent for the dependent sample. Although the "shrinkage" here
is large, some stability of the final equation is indicated and an
improvement shown over the dynamic model.

From practical considerations we wish to use only the most efficient
predictors. Those which offer little improvement in predictability (by
the added percent reduction in variance criterion) are chiefly of
theoretical interest. Undoubtedly the most practical prediction
equation giving the least overfit would involve the two parameters
selected in the statistical model with F ≥ 10, namely, advection of
thickness h and the 2-day height difference $T_i$. The
usefulness of those predictors with lower F-levels is doubtful, especially
when the small percent reduction in variance achieved by their use is
considered.
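
The "shrinkage" discussed above can be expressed as the difference between the reduction of variance on the dependent sample and that obtained when the same equation is applied, without refitting, to independent cases. The Python sketch below is illustrative only; the function names are hypothetical, and measuring the independent-sample variance about that sample's own mean is an assumption, since the text does not state the convention used.

```python
def reduction_of_variance(observed, predicted):
    """Fractional reduction of variance of the observations by the forecasts."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def shrinkage(prv_dependent, prv_independent):
    """Loss of explained variance when moving to independent data."""
    return prv_dependent - prv_independent


# Percentages quoted in the text for the six-predictor equation.
loss = shrinkage(22.04, 10.6)
```

For the figures quoted in the text the loss is 11.44 percentage points, roughly half of the dependent-sample value.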

Errors in data sampling have already been referred to in section 5,
particularly in reference to the dynamic predictors. Other errors are
map-scale error and interpolation errors, both of which are considered
to be negligible in this instance.
7. Conclusions

The prediction equations developed by this investigation do not offer
a significant improvement over existing methods. This is felt to be due
partly to the limitations inherent in applying the statistical technique as
a linear operator and partly to the fact that single-point
observations fail to capture gradient effects in the manner of a closely
spaced grid. Considering that the chosen latitude and season indicate
that baroclinic development can normally be expected, any useful
filter should predict non-linear effects in a consistent fashion. The
results obtained by applying the developed regression equations to the
independent samples indicate that this is not being done and that we
must recognize some of the shortcomings of the measurement methods.
The most immediate improvements suggested from the results are:
(1) decreasing the Lagrangian time step; and (2) employing an entire
grid map to verify actual $\Delta p_0$ patterns.
Table 1. Stratification of data for analysis of the dynamic model

Grouping               Number of stations   Population size
                       in sample
Individual stations     1                    30
Continental stations   15                   450
Ocean stations          9                   270
Total of stations      24                   720
Table 2. Summary of pertinent statistics for various sample
stratifications for the dynamic model

(a) Dependent data

Grouping         Sample size   Variable                    PRV    R      Final F value
Total stations   720           $-V_E \cdot \nabla\eta$     .002
                               $-V_E \cdot \nabla h$       .144
                               combined                    .146   .382   61.31
Land stations    450           $-V_E \cdot \nabla\eta$     .002
                               $-V_E \cdot \nabla h$       .179
                               combined                    .181   .426   49.64
Ocean stations   270           $-V_E \cdot \nabla\eta$     .004
                               $-V_E \cdot \nabla h$       .130
                               combined                    .134   .366   20.62

(b) Independent data

Grouping         Sample size   Variable                    PRV    R      Final F value
Land stations    75            combined                    .01    .10    <1
Table 3. Summary of results obtained from dependent data for the
statistical model

Predictor                            F level    Percent        Coefficient in
                                     on entry   reduction in   regression equation
                                                variance       (F ≥ 5.0)
$-V_E \cdot \nabla h$                90.6       16.65          -1.6
$T_i$                                10.8        1.78          -.28
$-V_E \cdot \nabla\eta$               6.9        1.06          -.18
$u_i$                                 6.5         .98          -.54
$T_{i-6}$ (90° upstream)              5.9         .86          -.24
$\bar{Z}_{i-10}$ (150° upstream)      5.0         .71          -.14
Constant term                                                  116.1

Standard deviation of $\Delta p_0$, $s_y$    7.484 mb
$R^2$                                        0.22
R                                            0.48
F(6, 443)                                    20.9
$F_c$(.99)                                   2.85